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Foam: A General purpose Monte Carlo Cellular Algorithm* 
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A general-purpose, self-adapting Monte Carlo (MC) algorithm implemented in the program Foam is described. 
The high efficiency of the MC, that is small maximum weight or variance of the MC weight is achieved by means 
of dividing the integration domain into small cells. The cells can be n-dimensional simplices, hyperrectangles 
or a Cartesian product of them. The grid of cells, "foam", is produced in the process of the binary split of the 
cells. The next cell to be divided and the position/direction of the division hyperplane is chosen by the algorithm 
which optimizes the ratio of the maximum weight to the average weight or (optionally) the total variance. The 
algorithm is able to deal, in principle, with an arbitrary pattern of the singularities in the distribution. 



Introduction and motivation 

For the problem of function minimalization of 
the arbitrary user-function one applies MINUIT 
or other program from the NAG library. One 
finds also many general purpose programs for in- 
tegration of "arbitrary" integrand. The above 
general-purpose tools work, in principle, for a 
very wide range of the user-functions. For multi- 
dimensional Monte Carlo simulation problem, 
that is for the problem of generating randomly 
points according to n-dimensional distribution, 
there is precious little examples of the general 
purpose Monte Carlo Simulators (GPMCS), that 
is programs which work (in principle) for an ar- 
bitrary integrand. 

Inevitably the GPMCS has to work in 2 stages: 
"exploration" and "generation". During explo- 
ration GPMCS is "digesting" the entire shape 
of the n-dimensional distribution p(xi, x%, ...x n ) 
to be generated and memorizes it as efficiently 
as possible using all CPU power and mem- 
ory available. Obviously, for the memorized 
p'(xx, X2, ■■■x n ) a method of the MC generation 
of the points x exactly according to p'(x), has to 
be available. The quality of the distribution of 
the weight w — p/p' for events in the generation 
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(small variance, good ration of maximum to av- 
erage, etc.) is determined by the algorithm of the 
exploration. In other words, the quality of the 
"target weight distribution" in the latter genera- 
tion is determining/driving the algorithm of the 
former exploration. 

At the time of writing up this contribution a 
detailed description of the algorithm and the pro- 
gram is available in ref. 0. The original, less 
advanced version was described in ref. The 
other old and new efforts in the area of general 
purpose MC methods see refs. §,[|H@,||1 
Cellular algorithm of the Foam 
The most obvious method to minimize the vari- 
ance (or maximum weight) of the target weight 
distribution in generation is to split integration 
domain into many cells^j, such that p{x) is approx- 
imated by constant p'{x) within each cell. This 
is a cellular class of general purpose MC algo- 
rithms^] which always involves a kind of cellular 
exploration of the user defined distribution. 

The obvious questions are: what kind (shape) 
of cells to use and how to cover the space with 
cells? In Foam algorithm/program three types of 
cells are used: pure simplices, pure hyperrectan- 
gles and Cartesian products of them. Cells of this 
kind can be parametrized rather easily and the 
related computer memory consumption is not ex- 



2 This general idea is at least 40 years old. 
3 "Stratified sampling" , used in the literature, has a nar- 
rower meaning. 
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cessive. 

The system of cells can be created all at once 
(like in VEGAS §) or in an evolutionary way, by 
the "splitting process" . The Foam algorithm does 
the binary split of cells. Next cell to be split is 
chosen by examining the "target weight distribu- 
tion" , that is of the event generation in the latter 
phase. The binary split provides automatically 
the full coverage of the space, simply because the 
primary "root cell" is identical with the entire in- 
tegration domain and there is never any mismatch 
between parent cell and two daughter cells. 

Another important issue is the question of the 
variance reduction versus maximum weight re- 
duction. In construction of the Foam algorithm 
we have put most effort on the minimization of 
the ratio of the maximum weight to the average 
weight u> m ax/(w}. This parameter is essential, if 
we want to transform w-tcd events into w = 1 
events, at the latter stage of the MC generation. 
Minimizing maximum weight is not the same as 
minimizing variance a = U (w 2 ) — (w) 2 . Usually 
minimizing w m ax is more difficult. In Foam mini- 
mizing variance is also implemented and option- 
ally available. It can be useful in the case when a 
MC with w-ted events is acceptable. In fig. |l| we 
show two examples of the weight distribution evo- 
lution in the Foam, when adding more and more 



cells. 

In the Foam algorithm we define two auxil- 
iary distributions p'(x) and pi OS s(x) related to 
integrand p{x). Both are constructed together 
step by step, simultaneously with the construc- 
tion of the foam of cells, in the exploration pro- 
cess. When our ultimate aim is to minimize w ma . x , 
we define 

p'(x) = max p(y). for x G Cellj, 

yeCell; 

Rioss = J d n x [p'{x) - p(x)} = J d n x pioss{x). 

here pi OS s is the difference between the "ceiling 
distribution" p' and the actual distribution p from 
which it is derived. The rejection rate in the final 
MC run is just proportional to J pi OS s(x) by con- 
struction, i.e. the rejection rate = Ri oss /R. Also, 
Pioss (x) has a clear geometrical meaning, see be- 
low. When our principal aim is to minimize the 
ratio of the variance to the average of the weight, 
<j/{w), in the final MC generation, we are led to 
the following definition: 

p'(x) = y/(p 2 )i, for x e Cell u 

pioss(x) = \J (p 2 )i - (p)i, for xeCelli. 

The average (...}/ is over the 7-th cellQ. The ratio 
o~ / (w) in the final MC generation is a monotonous 

4 See ref. [hi for a detailed derivation of the above prescrip- 



ascending function of Ri OS s = / pioss{x)dx n - 
hence, minimization of Ri OS s is equivalent to min- 
imization of a/(w). 

The basic rules governing binary split of a cell 
is that each split of a cell: ui — > us' + 10" should 
decrease the total Rioss and the decrease should 
be as big as possible. How to get the best total 
decrease ARi oss ? 

(1) For each next cell split we choose a cell with 
the biggest Ri oss . 

(2) Position/direction of a plane dividing a par- 
ent cell into two daughter cells is chosen to get 
the smallest total Ri OS s- 

The actual procedure relies on the small MC 
exercise done for each single cell during its ex- 
ploration, in which events are generated with the 
flat distribution, weighted with p and projected 
onto n (simplical case) or n(n + l)/2 (hyperrect- 
angular case) of the cells. Resulting projection 
histograms are then analysed and the best "divi- 
sion geometry" is found, for which the estimate 
of AR[ oss is calculated. How the geometry of 
the division algorithm looks like? In case of a 
n-dimensional simplex defined by n + 1 vertices, 
a pair of vertices Xi and Xj is chosen and a new 
vertex Y is put somewhere on the line in between: 
Y = \xi + (1 - X)xj, < A < 1. Two daugh- 
ter simplices are defined with two list of vertices: 

(x\ , . . . , Xi— 1 , Y : Xi-\- \ , ■ ■ ■ , Xj — i , Xj , Xj-\- \ , . . . , Xn j £n+l ) 

and (xi , ■ ■ ■ , Xi—\ , x^+x , . . . , Xj—i , Y ^ xj+i , . . . , x n , x n +\ ) . 
Now, how do we choose a pair (i, j) and the value 
of A? A short sample of the MC events (100-1000) 
is generated 6 cell. Each MC point is projected 
(see fig. ||) X — > Y onto an edge ^ j- For 

each the dN/dX is histogrammed, and its 

"loss" functional Ri oss is estimated. The optimal 
edge with the biggest loss is selected. For 

the optimal edge the cell division ration A is de- 
duced from its histogram (A is always a rational 
number, n/N bin ). 

In fig. U we show several examples of the evo- 
lution of the 2-dimensional foam of cells in case 
of simplices (triangles) and rectangles for up to 
2500 cells. 

The important practical question is whether 
hyperrectangular cells or simplical cells are beter 
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tion. 



Figure 3. Examples of the 2-dimensional foam. Num- 
ber of cells from 10 to 2500. 



as a building blocks. There is no simple answer. 
Simplices are limited to low dimensions n < 6 be- 
cause calculation of increasingly large number of 
determinants leads to much CPU time consump- 
tion. In addition, for simplices memory consump- 
tion is ~ 16n Bytes/Cell while for hyperrectan- 
gles memory consumption can be limited to below 
50Bytes/Cell independently of n. See ref. [Q for 
details. Experimenting with many testing func- 
tions has also shown that quite often hyperrectan- 
gles provide final MC efficiency as good or some- 
times even better than for simplices. 

The method of cell division relying on the anal- 
ysis of the projection histograms requires many 
function calls in the cell exploration. The over- 
all efficiency of the final MC is best for large 
number of cells. That points towards necessity 
of the large number of the function calls dur- 
ing the exploration, which may lead to excessive 
CPU time consumption. The following efficient 
trick was found to limit number of the function 
calls in the cell exploration: During MC explo- 
ration of a given new cell we continuously mon- 
itor the number of accumulated effective W = 1 





Figure 2. Geometry of the split of the 3-dimensional simplex cell. 
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when N e ff /ribin > 25, where n^m is the number 
of bins in each histogram used to estimate the 
best division direction/edge and parameter. In 
this way the increase of N samp is not wasted for 
cells inside which the integrand is already varying 
very little. 

Programs 

At present the main development version of the 
Foam implementation is available in the C++ lan- 
guage. Two other versions written in Fortran 77 
are also available. They are functionally equiva- 
lent to version in C++0. 

Conclusions 

Foam algorithm is a versatile adaptive, general- 
purpose type of an algorithm based on the cellular 
division of the integration domain. The geome- 
try of the "foam of cells" is rather simple, cells of 
simplical and/or hyperrectangular shape are con- 
structed in the process of a binary split. It works 
in principle for arbitrary distribution - no as- 
sumption of factorizability as in VEGAS of Ref. §]. 
Encoding cells with memory- efficient methods al- 
lows up to ~ 10 6 cells to be built in the computer 
memory of a typical desktop computer. 
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